21 research outputs found

    nsroot: Minimalist Process Isolation Tool Implemented With Linux Namespaces

    Data analyses in the life sciences are moving from tools run on a personal computer to services run on large computing platforms. This creates a need to package tools and dependencies for easy installation, configuration and deployment on distributed platforms. In addition, for secure execution there is a need for process isolation on a shared platform. Existing virtual machine and container technologies are often more complex than traditional Unix utilities, like chroot, and often require root privileges in order to set up or use. This is especially challenging on HPC systems where users typically do not have root access. We therefore present nsroot, a lightweight Linux namespaces based process isolation tool. It allows restricting the runtime environment of data analysis tools that may not have been designed with security as a top priority, in order to reduce the risk and consequences of security breaches, without requiring any special privileges. The codebase of nsroot is small, and it provides a command line interface similar to chroot. It can be used on all Linux kernels that implement user namespaces. In addition, we propose combining nsroot with the AppImage format for secure execution of packaged applications. nsroot is open sourced and available at: https://github.com/uit-no/nsroot

    nsroot: Minimalist process isolation tool implemented with Linux namespaces

    Data analyses in the life sciences are moving from tools run on a personal computer to services run on large computing platforms. This creates a need to package tools and dependencies for easy installation, configuration and deployment on distributed platforms. In addition, for secure execution there is a need for process isolation on a shared platform. Existing virtual machine and container technologies are often more complex than traditional Unix utilities, like chroot, and often require root privileges in order to set up or use. This is especially challenging on HPC systems where users typically do not have root access. We therefore present nsroot, a lightweight Linux namespaces based process isolation tool. It allows restricting the runtime environment of data analysis tools that may not have been designed with security as a top priority, in order to reduce the risk and consequences of security breaches, without requiring any special privileges. The codebase of nsroot is small, and it provides a command line interface similar to chroot. It can be used on all Linux kernels that implement user namespaces. In addition, we propose combining nsroot with the AppImage format for secure execution of packaged applications. nsroot is open sourced and available at: https://github.com/uit-no/nsroot

    Kvik: Interactive exploration of genomic data from the NOWAC postgenome biobank

    We have developed Kvik, a system for interactive exploration of genomic data from the Norwegian Women and Cancer (NOWAC) postgenome biobank. The goal of the NOWAC study is to understand the dynamics of carcinogenesis through multi-level functional analyses of transcriptomics and epigenetics using blood and tissue samples. Kvik provides a tool for exploring gene expression data, incorporating both statistical analysis and interactive visualizations in a single system. The tool is open-sourced at github.com/fjukstad/kvik

    Teaching Electronics and Programming in Norwegian Schools Using the air:bit Sensor Kit

    We describe lessons learned from using the air:bit project to introduce more than 150 students in Norwegian upper secondary schools to computer programming, engineering and environmental sciences. In the air:bit project, students build and code portable air quality sensor kits, and use their air:bit to collect data to investigate patterns in air quality in their local environment. When the project ended, students had collected more than 400,000 measurements with their air:bit kits, and could describe local patterns in air quality. Students participate in all parts of the project, from soldering components and programming the sensors, to analyzing the air quality measurements. We conducted a survey after the project and describe our lessons learned from the project. The results show that the project successfully taught the students fundamental concepts in computer programming, electronics, and the scientific method. In addition, all the participating teachers reported that their students had shown good learning outcomes

    Transcription factor PAX6 as a novel prognostic factor and putative tumour suppressor in non-small cell lung cancer

    Source at https://doi.org/10.1038/s41598-018-23417-z. Licensed CC BY-NC-ND 4.0. Lung cancer is the leading cause of cancer deaths. Novel predictive biomarkers are needed to improve treatment selection and enable more accurate prognostication. PAX6 is a transcription factor with a proposed tumour suppressor function. Immunohistochemical staining for PAX6 was performed on tissue microarrays from 335 non-small cell lung cancer (NSCLC) patients. Multivariate analyses of clinico-pathological variables and disease-specific survival (DSS) were carried out, and phenotypic changes of two NSCLC cell lines with knockdown of PAX6 were characterized. While PAX6 expression was only associated with a trend of better DSS (p = 0.10), the pN+ subgroup (N = 103) showed significant correlation between high PAX6 expression and longer DSS (p = 0.022). Median survival for pN+ patients with high PAX6 expression was 127.4 months, versus 22.9 months for patients with low PAX6 expression. In NCI-H661 cells, knockdown of PAX6 strongly activated serum-stimulated migration. In NCI-H460 cells, PAX6 knockdown activated anchorage-independent growth. We did not observe any significant effect of PAX6 on proliferation in either cell line. Our findings strongly support the proposition of PAX6 as a valid and positive prognostic marker in NSCLC in node-positive patients. There is a need for further studies, which should provide a mechanistic explanation for the role of PAX6 in NSCLC

    Kvik : interactive exploration of genomic data from the NOWAC postgenome biobank

    Recent technological advances provide large amounts of data for epidemiological analyses that can provide novel insights in the dynamics of carcinogenesis. These analyses are often performed without prior hypothesis and therefore require an exploratory approach. Realizing exploratory analysis requires the development of new systems that provide interactive exploration and visualization of large-scale scientific datasets. This thesis presents Kvik, an interactive system for exploring the dynamics of carcinogenesis through integrated studies of biological pathways and genomic data. Kvik is designed as a three-tiered application, an architecture that is commonly used for peta-scale applications. It provides researchers with a lightweight web application for navigating through biological pathways from the KEGG database integrated with genomic data from the NOWAC postgenome biobank. In collaboration with researchers from the NOWAC systems epidemiology group, we have described the requirements for such a system, and by using an iterative approach we implemented Kvik through small development cycles, involving the end-users in the development process. Throughout the project we have gained valuable interdisciplinary experience in developing systems for use in explorative analysis of carcinogenesis. Through an evaluation of the exploration tasks and workflow of an end-user, we demonstrate that Kvik has the capability of interactive exploration of genomic data and biological pathways. We believe Kvik is important to enable novel discoveries from the data produced in the NOWAC systems epidemiology project. It provides epidemiology researchers with access to powerful compute and storage resources enabling the use of advanced statistical methods for the analysis. Finally, from our experiences in developing Kvik, we provide use cases and requirements for future analysis, computation and storage systems developed in our research group and by others

    Toward Reproducible Analysis and Exploration of High-Throughput Biological Datasets

    This dissertation argues that we can develop unified systems for reproducible exploration and analysis of high-throughput biological datasets. We propose an approach, Small Modular Entities (SME), that orchestrates the execution of analysis pipelines and data exploration applications. We realize SMEs using software container technologies together with well-defined interfaces, configuration, and orchestration. It simplifies the development of such applications, and provides detailed information to reproduce the analyses

    A Review of Scalable Bioinformatics Pipelines

    Scalability is increasingly important for bioinformatics analysis services, since these must handle larger datasets, more jobs, and more users. The pipelines used to implement analyses must therefore scale with respect to the resources on a single compute node, the number of nodes on a cluster, and also to cost-performance. Here, we survey several scalable bioinformatics pipelines and compare their design and their use of underlying frameworks and infrastructures. We also discuss current trends for bioinformatics pipeline development

    Reproducible Data Analysis Pipelines for Precision Medicine

    Precision medicine brings the promise of more precise diagnosis and individualized therapeutic strategies from analyzing a cancer’s genomic signature. Technologies such as high-throughput sequencing enable cheaper data collection at higher speed, but rely on modern data analysis platforms to extract knowledge from these high dimensional datasets. Since this is a rapidly advancing field, new diagnoses and therapies often require tailoring of the analysis. These pipelines are therefore developed iteratively, continuously modifying analysis parameters before arriving at the final results. To enable reproducible results it is important to record all these modifications and decisions made during the analysis process. We built a system, walrus, to support reproducible analyses for iteratively developed analysis pipelines. The approach is based on our experiences developing and using deep analysis pipelines to provide insights and recommendations for treatment in an actual breast cancer case. We designed walrus for the single servers or small compute clusters typically available for novel treatments in the clinical setting. walrus leverages software containers to provide reproducible execution environments, and integrates with modern version control systems to capture provenance of data and pipeline parameters. We have used walrus to analyze a patient’s primary tumor and adjacent normal tissue, including subsequent metastatic lesions. Although we have used walrus for specialized analyses of whole-exome sequencing datasets, it is a general data analysis tool that can be applied in a variety of scientific disciplines. We have open sourced walrus along with example data analysis pipelines at github.com/uit-bdps/walrus.

    Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies

    Published version. Source at https://doi.org/10.12688/f1000research.6238.1. Kvik is an open-source system that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, and specific visualizations. Existing data exploration tools do not provide all the required functionality for such multi-study data exploration. We have therefore developed the Kvik framework which makes it easy to implement specialized data exploration tools for specific projects. Applications in Kvik follow the three-tier architecture commonly used in web applications, with REST interfaces between the tiers. This makes it easy to adapt the applications to new statistical analyses, metadata, and visualizations. Kvik uses R to perform on-demand data analyses when researchers explore the data. In this note, we describe how we used Kvik to develop the Kvik Pathways application to explore gene expression data from healthy women with high and low plasma ratios of essential fatty acids using biological pathway visualizations. Researchers interact with Kvik Pathways through a web application that uses the JavaScript libraries Cytoscape.js and D3. We use Docker containers to make deployment of Kvik Pathways simple